Adversarial examples against AI systems pose both risks, via malicious attacks, and opportunities, via adversarial training, for improving robustness. In multiagent settings, adversarial policies can be developed by training an adversarial agent to minimize a victim agent's rewards. Prior work has studied black-box attacks where the adversary only sees the state observations and effectively treats the victim as any other part of the environment. In this work, we experiment with white-box adversarial policies to study whether an agent's internal state can offer useful information to other agents. We make three contributions. First, we introduce white-box adversarial policies in which an attacker can observe a victim's internal state at each timestep. Second, we demonstrate that white-box access to a victim makes for better attacks in two agent environments, resulting in faster initial learning and higher asymptotic performance against the victim. Third, we show that training against white-box adversarial policies can be used to make learners in single-agent environments more robust to domain shift.
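A minimal sketch of the black-box vs. white-box distinction described above; the observation tuples, hidden-state values, and zero-sum reward are illustrative stand-ins, not the paper's implementation.

```python
def adversary_observation(env_obs, victim_hidden, white_box):
    """A black-box adversary sees only the environment observation;
    a white-box adversary additionally sees the victim's internal
    state at each timestep."""
    obs = tuple(env_obs)
    if white_box:
        obs += tuple(victim_hidden)
    return obs

def adversary_reward(victim_reward):
    """The adversarial policy is trained to minimize the victim's
    reward, so its own reward is the victim's reward negated."""
    return -victim_reward
```

The only change between the two attack settings is the input space of the adversary; the training objective is identical.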
Multi-agent reinforcement learning (MARL) is a powerful tool for training automated systems that act independently in a common environment. However, it can lead to sub-optimal behavior when individual incentives and group incentives diverge. Humans are remarkably capable of solving these social dilemmas. Replicating such cooperative behavior among selfish agents is an open problem in MARL. In this work, we draw upon the idea of formal contracting from economics to overcome diverging incentives between MARL agents. We propose an augmentation of Markov games in which agents voluntarily agree to binding, state-dependent transfers of reward under pre-specified conditions. Our contributions are theoretical and empirical. First, we show that this augmentation makes all subgame-perfect equilibria of all fully observed Markov games exhibit socially optimal behavior, given a sufficiently rich space of contracts. Next, we complement our game-theoretic analysis by showing that state-of-the-art RL algorithms learn socially optimal policies given our augmentation. Our experiments include classic static dilemmas such as Stag Hunt, Prisoner's Dilemma, and public goods games, as well as dynamic interactions simulating traffic, pollution management, and common-pool resource management.
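A toy illustration of a state-dependent reward transfer on a Prisoner's Dilemma stage game; the payoff numbers and contract terms are hypothetical, chosen only to show how a transfer can remove the incentive to defect unilaterally.

```python
PD_PAYOFFS = {  # (row action, col action) -> (row reward, col reward)
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 5.0),
    ("D", "C"): (5.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}

def apply_contract(actions, transfer=3.0):
    """Contract: if exactly one agent defects, it transfers reward to
    the cooperator. Under this transfer, unilateral defection pays
    less than mutual cooperation."""
    r0, r1 = PD_PAYOFFS[actions]
    a0, a1 = actions
    if a0 == "D" and a1 == "C":
        r0, r1 = r0 - transfer, r1 + transfer
    elif a0 == "C" and a1 == "D":
        r0, r1 = r0 + transfer, r1 - transfer
    return r0, r1
```

With the contract in force, a row player deciding between ("C","C") at 3.0 and ("D","C") at 2.0 no longer gains by defecting.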
Designing recommendation systems that serve content aligned with time-varying preferences requires properly accounting for the feedback effects of recommendations on human behavior and psychological condition. We argue that modeling the influence of recommendations on people's preferences must be grounded in psychologically plausible models. We contribute a methodology for developing grounded dynamic preference models. We demonstrate this method with models that capture three classic effects from the psychology literature: Mere Exposure, Operant Conditioning, and Hedonic Adaptation. We conduct simulation-based studies showing that the psychological models manifest distinct behaviors that can inform system design. Our study has two direct implications for dynamic user modeling in recommendation systems. First, the methodology we outline is broadly applicable for psychologically grounding dynamic preference models. It allows us to critique recent contributions based on their limited discussion of psychological foundations and their implausible predictions. Second, we discuss the implications of dynamic preference models for the evaluation and design of recommendation systems. In one example, we show that engagement and diversity metrics may fail to capture desirable recommendation system performance.
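A minimal dynamic preference model in the spirit of the mere-exposure effect; the linear-interpolation update rule, rate, and ceiling are illustrative assumptions, not the paper's models.

```python
def mere_exposure_update(preference, exposed, rate=0.1, ceiling=1.0):
    """Mere exposure: each exposure nudges preference toward a
    ceiling; preferences for unexposed items are left unchanged."""
    if not exposed:
        return preference
    return preference + rate * (ceiling - preference)
```

Under repeated exposure the preference rises monotonically but stays bounded, one example of the kind of qualitative behavior a grounded model can be checked against.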
The last decade of machine learning has seen drastic increases in scale and capability, and deep neural networks (DNNs) are increasingly being deployed across a wide variety of domains. However, the inner workings of DNNs are generally difficult to understand, raising concerns about the safety of using these systems without a rigorous understanding of how they function. In this survey, we review the literature on techniques for interpreting the inner components of DNNs, which we call "inner" interpretability methods. Specifically, we review methods for interpreting weights, neurons, subnetworks, and latent representations, with a focus on how these techniques relate to the goal of designing safer, more trustworthy AI systems. We also highlight connections between interpretability and work on adversarial robustness, continual learning, network compression, and studying the human visual system. Finally, we discuss key challenges and argue for future work in interpretability for AI safety that focuses on diagnostics, benchmarking, and robustness.
From the earliest years of our lives, humans use language to express our beliefs and desires. Being able to talk to artificial agents about our preferences would thus fulfill a central goal of value alignment. Yet today, we lack computational models explaining such flexible and abstract language use. To address this challenge, we consider social learning in a linear bandit setting and ask how a human might communicate preferences over behaviors (i.e., the reward function). We study two distinct types of language: instructions, which provide information about the desired policy, and descriptions, which provide information about the reward function. To explain how humans use these forms of language, we suggest they reason about both known present and unknown future states: instructions optimize for the present, while descriptions generalize to the future. We formalize this choice by extending reward design to consider a distribution over states. We then define a pragmatic listener, an agent that infers the speaker's reward function by reasoning about how the speaker expresses themselves. We validate our models with behavioral experiments, showing that (1) our speaker model predicts spontaneous human behavior, and (2) our pragmatic listener is able to recover their reward functions. Finally, we show that in a traditional reinforcement learning setting, pragmatic social learning can integrate with and accelerate individual learning. Our findings suggest that social learning from a wider range of language, in particular expanding the field's present focus on instructions to include learning from descriptions, is a promising approach for value alignment and reinforcement learning.
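A tiny rational-speech-acts-style pragmatic listener to make the inference concrete; the candidate rewards, the "good" utterance, and the literal speaker model are all hypothetical, not the paper's models.

```python
def literal_speaker(utterance, reward):
    """P(u | r): a toy speaker that mostly produces utterances
    consistent with the reward's sign."""
    consistent = (utterance == "good") == (reward > 0)
    return 0.9 if consistent else 0.1

def pragmatic_listener(utterance, candidate_rewards):
    """Infer P(r | u) by Bayes' rule under a uniform prior, reasoning
    about how the speaker would choose utterances for each reward."""
    scores = [literal_speaker(utterance, r) for r in candidate_rewards]
    z = sum(scores)
    return [s / z for s in scores]
```

Hearing "good", the listener shifts its posterior toward positive-reward hypotheses, the same inversion-of-the-speaker idea as in the abstract, reduced to two hypotheses.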
The content that a recommender system (RS) shows to users influences them. Therefore, when choosing a recommender to deploy, one is implicitly also choosing to induce specific internal states in users. Even more, systems trained via long-horizon optimization will have direct incentives to manipulate users; in this work, we focus on the incentive to shift user preferences so they are easier to satisfy. We argue that, before deployment, system designers should: estimate the shifts a recommender would induce; evaluate whether such shifts are undesirable; and perhaps even actively optimize to avoid problematic shifts. These steps involve two challenging ingredients. Estimation requires anticipating how hypothetical algorithms would influence user preferences if deployed; we do this by using historical user interaction data to train a predictive user model that implicitly captures their preference dynamics. Evaluation and optimization additionally require metrics to assess whether such influences are manipulative or otherwise unwanted; we use the notion of "safe shifts", which defines a trust region within which behavior is safe: for instance, the natural way in which users would shift without interference from the system could be deemed "safe". In simulated experiments, we show that our learned preference dynamics model is effective in estimating user preferences and how they would respond to new recommenders. Additionally, we show that recommenders that optimize for staying in the trust region can avoid manipulative behaviors while still generating engagement.
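A sketch of the "safe shift" idea: compare a predicted preference trajectory against the natural, recommender-free trajectory and penalize excursions beyond a trust region. The scalar preference trajectories and the radius are illustrative assumptions.

```python
def shift_penalty(predicted_prefs, natural_prefs, radius):
    """Return 0 when the predicted preference trajectory stays within
    the trust region around the natural trajectory; otherwise return
    the largest excursion beyond the allowed radius."""
    worst = max(abs(p - n) for p, n in zip(predicted_prefs, natural_prefs))
    return max(0.0, worst - radius)
```

A recommender could then be trained on engagement minus this penalty, so staying inside the trust region costs nothing while large induced shifts are discouraged.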
While modern policy optimization methods can perform sophisticated manipulation from sensory data, they struggle on problems with extended time horizons and multiple sub-goals. On the other hand, task and motion planning (TAMP) methods scale to long horizons, but they are computationally expensive and need to precisely track world state. We propose a method that draws on the strengths of both approaches: we train a policy to imitate a TAMP solver's output. This produces a feed-forward policy that can accomplish multi-step tasks from sensory data. First, we build an asynchronous distributed TAMP solver that can produce supervision data fast enough for imitation learning. Then, we propose a hierarchical policy architecture that lets us use partially trained control policies to speed up the TAMP solver. In robotic manipulation tasks with 7 degrees of freedom, the partially trained policies reduce the time needed for planning by a factor of up to 2.6. Among these tasks, we can learn a policy that solves a 4-object pick-and-place task 88% of the time from object-pose observations, and a policy that solves a 9-goal robot benchmark 79% of the time from RGB images (averaged across the 9 different tasks).
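The imitation step above reduces to supervised regression on (observation, action) pairs produced by the planner. A minimal sketch with a hypothetical 1-D linear policy and a toy dataset standing in for the TAMP solver's output:

```python
def fit_linear_policy(data, lr=0.1, steps=500):
    """Behavior cloning by least squares: fit action ~ w * obs + b to
    planner-generated (observation, action) pairs via gradient
    descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for obs, act in data:
            err = (w * obs + b) - act
            gw += err * obs
            gb += err
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b
```

The resulting feed-forward policy answers queries in constant time, which is what makes it useful both at deployment and for warm-starting the solver.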
When robots interact with humans in homes, roads, or factories the human's behavior often changes in response to the robot. Non-stationary humans are challenging for robot learners: actions the robot has learned to coordinate with the original human may fail after the human adapts to the robot. In this paper we introduce an algorithmic formalism that enables robots (i.e., ego agents) to co-adapt alongside dynamic humans (i.e., other agents) using only the robot's low-level states, actions, and rewards. A core challenge is that humans not only react to the robot's behavior, but the way in which humans react inevitably changes both over time and between users. To deal with this challenge, our insight is that -- instead of building an exact model of the human -- robots can learn and reason over high-level representations of the human's policy and policy dynamics. Applying this insight we develop RILI: Robustly Influencing Latent Intent. RILI first embeds low-level robot observations into predictions of the human's latent strategy and strategy dynamics. Next, RILI harnesses these predictions to select actions that influence the adaptive human towards advantageous, high reward behaviors over repeated interactions. We demonstrate that -- given RILI's measured performance with users sampled from an underlying distribution -- we can probabilistically bound RILI's expected performance across new humans sampled from the same distribution. Our simulated experiments compare RILI to state-of-the-art representation and reinforcement learning baselines, and show that RILI better learns to coordinate with imperfect, noisy, and time-varying agents. Finally, we conduct two user studies where RILI co-adapts alongside actual humans in a game of tag and a tower-building task. See videos of our user studies here: https://youtu.be/WYGO5amDXbQ
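A toy sketch of the high-level loop RILI describes: summarize recent low-level observations of the partner into a latent strategy prediction, then pick the action that scores best against that prediction. The exponential-smoothing "embedding" and the scalar strategy are hypothetical stand-ins for RILI's learned representations.

```python
def predict_latent_strategy(history, smoothing=0.5):
    """Embed a history of low-level observations of the partner into
    a scalar latent strategy estimate via exponential smoothing."""
    z = history[0]
    for x in history[1:]:
        z = smoothing * z + (1 - smoothing) * x
    return z

def select_action(candidates, latent, reward_fn):
    """Choose the candidate action with the highest predicted reward
    given the partner's predicted latent strategy."""
    return max(candidates, key=lambda a: reward_fn(a, latent))
```

The key design choice mirrored here is that the robot never models the human exactly; it only tracks a compact latent summary that is enough to select influencing actions.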
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
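The patch-based training trick reported by 69% of respondents can be sketched in a few lines: split an oversized sample into fixed-size windows so each fits in memory. The image size, patch size, and stride below are toy values.

```python
def extract_patches(image, size, stride):
    """Slide a size x size window with the given stride over a 2-D
    image (a list of rows) and return the resulting patches."""
    patches = []
    rows, cols = len(image), len(image[0])
    for i in range(0, rows - size + 1, stride):
        for j in range(0, cols - size + 1, stride):
            patches.append([row[j:j + size] for row in image[i:i + size]])
    return patches
```

Training then proceeds on individual patches, with predictions stitched back together at inference time.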
Recent methods in self-supervised learning have demonstrated that masking-based pretext tasks extend beyond NLP, serving as useful pretraining objectives in computer vision. However, existing approaches apply random or ad hoc masking strategies that limit the difficulty of the reconstruction task and, consequently, the strength of the learnt representations. We improve upon current state-of-the-art work in learning adversarial masks by proposing a new framework that generates masks in a sequential fashion with different constraints on the adversary. This leads to improvements in performance on various downstream tasks, such as classification on ImageNet100, STL10, and CIFAR10/100 and segmentation on Pascal VOC. Our results further demonstrate the promising capabilities of masking-based approaches for SSL in computer vision.
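A toy greedy adversary in the spirit of adversarial masking: hide the patches the model currently reconstructs worst, subject to a masking-budget constraint. The per-patch error values and budget are illustrative; the paper's adversary is learned and sequential rather than this one-shot greedy rule.

```python
def adversarial_mask(recon_errors, budget):
    """Select up to `budget` patch indices with the highest current
    reconstruction error, making the pretext task maximally hard
    under the budget constraint."""
    ranked = sorted(range(len(recon_errors)),
                    key=lambda i: recon_errors[i], reverse=True)
    return sorted(ranked[:budget])
```

Compared with random masking, targeting high-error patches keeps the reconstruction task difficult throughout training, which is the mechanism the abstract credits for stronger learned representations.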